Multiversion Concurrency Control (MVCC)

Introduction

Multiversion concurrency control (MVCC) is a database technique that keeps multiple copies of records so that data can be safely read and updated at the same time. It is a method that gives a user a concurrent and persistent view of distributed transactions across partitions. GigaSpaces keeps multiple versions of modified entries to ensure that the user has a persistent view of the data that is consistent with the system of record (SoR).

Processing a large number of simultaneous transactions in Smart DIH requires an extreme write throughput that cannot be paused. To maintain transactions in the platform, Space objects must not be locked. The MVCC mechanism provides an efficient solution, allowing massive updates while keeping the Space consistent with the systems of record (SoR). In this manner, the ACID properties of transactions (atomicity, consistency, isolation, and durability) are maintained, ensuring the consistency and integrity of the data before and after each update, even in highly available distributed systems.

For more information about the MVCC mechanism and how it is used in Smart DIH, read our blog on How to Achieve ACID Compliance on Distributed, Highly Available Systems (search for MVCC).

MVCC Flow

The diagram below shows how an update from the SoR enters a CDC (Change Data Capture) stream, travels through the Data Integration (DI) layer, and is finally applied in the Space.

  • In the Space, only the area shown in pink is visible to the user.

  • All newer updates (in blue) accumulate on top of that data and are not visible to the user.

  • When the update is applied (by the DI layer), that data becomes visible to the end user, and the MVCC update cycle continues.
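The visibility rule described in the bullets above can be sketched with a toy versioned store. This is an illustration only, not the GigaSpaces implementation: the class `ToyVersionedStore`, its methods, and the `account-1` key are all hypothetical names chosen for the example. Writers append new versions without locking, and readers see only the newest version that has been committed.

```python
class ToyVersionedStore:
    """Illustrative only: a minimal multiversion store, not GigaSpaces code."""

    def __init__(self):
        self._versions = {}   # uid -> list of (version, value), oldest first
        self._committed = 0   # highest version number visible to readers
        self._next = 1        # next version number handed to a writer

    def write(self, uid, value):
        """Append a new version without changing what readers currently see."""
        version = self._next
        self._next += 1
        self._versions.setdefault(uid, []).append((version, value))
        return version

    def commit(self, version):
        """Make every version up to and including `version` visible."""
        self._committed = max(self._committed, version)

    def read(self, uid):
        """Return the newest committed value for `uid`, or None if there is none."""
        visible = [value for ver, value in self._versions.get(uid, [])
                   if ver <= self._committed]
        return visible[-1] if visible else None


store = ToyVersionedStore()
v1 = store.write("account-1", 100)
store.commit(v1)                      # the "pink" data: visible to the user
v2 = store.write("account-1", 250)    # the "blue" data: newer, not yet visible
before = store.read("account-1")      # 100, the pending write is hidden
store.commit(v2)                      # the update is applied
after = store.read("account-1")       # 250
```

Because a write never replaces the version that readers are using, reads and writes do not block each other; the cost is the extra memory held by older versions until they are cleaned up.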

 

MVCC Configuration Properties

| Name | Type | Default Value | Description |
|------|------|---------------|-------------|
| space-config.mvcc.enabled | Boolean | false | Whether MVCC is enabled for the Space. |
| space-config.mvcc.historical_entry_lifetime | Integer | 5 | Time limit for holding an entry version in the cache. This is the main criterion for deciding whether a particular entry version should be cleaned up. |
| space-config.mvcc.historical_entry_lifetime_timeunit | TimeUnit | m | Unit of the time limit: milliseconds (ms), seconds (s), minutes (m), and so on. |
| space-config.mvcc.historical_entries_limit | Integer | 5 | Maximum number of historical entries per UID. Cannot be 0. Data lifetime takes precedence over this criterion: if the number of entries in the cache is below the limit but some entries are too old, they are purged. |
| space-config.mvcc.fixed_cleanup_delay_millis | Integer | 1000000 | Delay between cleanup iterations, in milliseconds. Set to 0 to enable a dynamic delay based on previous cleanups. |

The configuration settings for MVCC can be modified to tune its impact on memory consumption.
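To make the interaction between the lifetime and the entries limit concrete, here is a minimal sketch of one possible reading of the cleanup rule. The function `versions_to_purge` is hypothetical, and the exact rule is an assumption: a historical version is purged if it is older than the lifetime (even when the count is below the limit), or if it falls outside the newest `limit` versions.

```python
from datetime import datetime, timedelta


def versions_to_purge(historical_timestamps, now,
                      lifetime=timedelta(minutes=5), limit=5):
    """Return timestamps of historical versions that would be purged.

    Assumed rule (not taken from product code): purge a version if it is
    older than `lifetime`, or if it is not among the newest `limit` versions.
    """
    if limit <= 0:
        raise ValueError("historical_entries_limit cannot be 0")
    newest_first = sorted(historical_timestamps, reverse=True)
    purge = []
    for index, ts in enumerate(newest_first):
        too_old = now - ts > lifetime      # lifetime takes precedence
        over_limit = index >= limit        # more versions than the limit allows
        if too_old or over_limit:
            purge.append(ts)
    return purge


now = datetime(2024, 1, 1, 12, 0, 0)

# Three versions, below the limit of 5, but one is 10 minutes old.
few = [now - timedelta(minutes=1), now - timedelta(minutes=2),
       now - timedelta(minutes=10)]
purged_by_age = versions_to_purge(few, now)

# Seven recent versions: all within the lifetime, but two exceed the limit.
many = [now - timedelta(seconds=5 * i) for i in range(1, 8)]
purged_by_count = versions_to_purge(many, now)
```

With the defaults, the first call purges only the 10-minute-old version and the second purges the two oldest of the seven. Lowering either setting reduces the memory held by historical versions.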

Configuring a Space for MVCC

MVCC cannot be configured for a Space that is already active. To enable MVCC, a new Space has to be created.

To enable a Space for MVCC, perform the following steps:

  1. Add a new Space by following the steps outlined in the User Guide: SpaceDeck - Spaces - Adding a Space.

  2. In the Adding a New Space Parameters section, enable MVCC by adding the following Context Properties/Property Name: space-config.mvcc.enabled=true

  3. To change any of the other default parameters, add the corresponding Property Names.

  4. Once completed, click Create Space.
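As an illustration of steps 2 and 3, the sketch below assembles the `name=value` pairs that would be entered as Context Properties. The helper `context_properties` is hypothetical and is not part of any GigaSpaces tool. The lifetime key is written with a single `space-config.mvcc.` prefix, which is an assumption based on the naming of the other properties.

```python
# Property names and defaults as listed in the configuration table
# (the lifetime key's single prefix is an assumption).
MVCC_DEFAULTS = {
    "space-config.mvcc.enabled": "false",
    "space-config.mvcc.historical_entry_lifetime": "5",
    "space-config.mvcc.historical_entry_lifetime_timeunit": "m",
    "space-config.mvcc.historical_entries_limit": "5",
    "space-config.mvcc.fixed_cleanup_delay_millis": "1000000",
}


def context_properties(overrides=None):
    """Build name=value lines: MVCC enabled, plus any non-default settings."""
    props = {"space-config.mvcc.enabled": "true"}
    for name, value in (overrides or {}).items():
        if name not in MVCC_DEFAULTS:
            raise KeyError(f"unknown MVCC property: {name}")
        if name.endswith("historical_entries_limit") and int(value) == 0:
            raise ValueError("historical_entries_limit cannot be 0")
        props[name] = str(value)
    return [f"{name}={value}" for name, value in props.items()]


# Enable MVCC and keep up to 10 historical versions per UID.
lines = context_properties({"space-config.mvcc.historical_entries_limit": 10})
for line in lines:
    print(line)
```

Only properties that differ from their defaults need to be added; everything else keeps the default value from the table.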

Querying an MVCC Enabled Space 

Limitations - Partial Support

Performance Impact

  • Throughput (the number of transactions) decreases by 5-7%.

  • MVCC adds a RAM overhead of 25% on average.
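For capacity planning, the two figures above can be applied to a baseline as simple arithmetic. The function `estimate_mvcc_impact` and the baseline numbers (100,000 transactions per second and 64 GB of RAM) are hypothetical values chosen for illustration, and actual results will depend on the workload.

```python
def estimate_mvcc_impact(baseline_tps, baseline_ram_gb):
    """Apply the documented 5-7% throughput drop and 25% average RAM overhead."""
    tps_low = baseline_tps * (100 - 7) / 100     # worst case: 7% decrease
    tps_high = baseline_tps * (100 - 5) / 100    # best case: 5% decrease
    ram_gb = baseline_ram_gb * (100 + 25) / 100  # average RAM overhead
    return tps_low, tps_high, ram_gb


tps_low, tps_high, ram_gb = estimate_mvcc_impact(100_000, 64)
print(tps_low, tps_high, ram_gb)  # 93000.0 95000.0 80.0
```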